Precise Data Identification Services for Long Tail Research Data

Authors

  • Stefan Pröll
  • Andreas Rauber
  • Kristof Meixner
Abstract

While sophisticated research infrastructures assist scientists in managing massive volumes of data, the so-called long tail of research data frequently suffers from a lack of such services. This is mostly due to the complexity caused by the variety of data to be managed and a lack of easily standardisable procedures in highly diverse research settings. Yet, as even domains in this long tail of research data are increasingly data-driven, scientists need efficient means to communicate precisely which version and subset of data was used in a particular study, to enable reproducibility and comparability of results and to foster data re-use. This paper presents three implementations of systems supporting such data identification services for comma-separated value (CSV) files, a dominant format for data exchange in these settings. The implementations are based on the recommendations of the Working Group on Dynamic Data Citation of the Research Data Alliance (RDA). They provide implicit change tracking of all data modifications, while precise subsets are identified via the respective subsetting process. This enhances reproducibility of experiments and allows efficient sharing of specific subsets of data even in highly dynamic data settings.
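The idea of change tracking combined with subset identification can be illustrated with a minimal sketch. It is not one of the three implementations the paper presents; it only assumes the general pattern of versioning rows, timestamping changes, and storing the subsetting query together with a hash of its result. The class name VersionedCsvStore, its methods, the logical clock, and the "subset-N" identifiers are all hypothetical and chosen for illustration only.

```python
import csv
import hashlib
import io
import json


class VersionedCsvStore:
    """Hypothetical in-memory store; each row version has a validity interval."""

    def __init__(self):
        # Each record: [row_key, row_dict, valid_from, valid_to]; valid_to None = current.
        self.records = []
        self.queries = {}  # identifier -> stored query metadata
        self.clock = 0     # logical timestamp (assumption: stands in for wall-clock time)

    def _tick(self):
        self.clock += 1
        return self.clock

    def load_csv(self, text, key):
        """Import CSV text; every row becomes the first version of its key."""
        ts = self._tick()
        for row in csv.DictReader(io.StringIO(text)):
            self.records.append([row[key], dict(row), ts, None])
        return ts

    def update(self, row_key, **changes):
        """Close the current version of a row and append a new one (no overwrite)."""
        ts = self._tick()
        for rec in self.records:
            if rec[0] == row_key and rec[3] is None:
                rec[3] = ts
                self.records.append([row_key, {**rec[1], **changes}, ts, None])
                break
        return ts

    def _select(self, column, value, ts):
        """Rows valid at timestamp ts that match column == value, in a stable order."""
        rows = [
            rec[1]
            for rec in self.records
            if rec[2] <= ts and (rec[3] is None or ts < rec[3])
            and rec[1].get(column) == value
        ]
        return sorted(rows, key=lambda r: json.dumps(r, sort_keys=True))

    @staticmethod
    def _hash(rows):
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def cite_subset(self, column, value):
        """Store the query, its execution timestamp, and a hash of the result."""
        ts = self.clock
        rows = self._select(column, value, ts)
        pid = f"subset-{len(self.queries) + 1}"  # hypothetical identifier scheme
        self.queries[pid] = {
            "column": column,
            "value": value,
            "timestamp": ts,
            "hash": self._hash(rows),
        }
        return pid

    def resolve(self, pid):
        """Re-execute the stored query against the data as of its timestamp."""
        q = self.queries[pid]
        rows = self._select(q["column"], q["value"], q["timestamp"])
        if self._hash(rows) != q["hash"]:
            raise ValueError("result set no longer matches the stored hash")
        return rows
```

In this sketch, an identifier returned by cite_subset keeps resolving to the rows as they were when the subset was created, even after later calls to update, because old row versions are closed rather than overwritten.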


Similar resources

RADAR - A repository for long tail data

The way knowledge is shared is experiencing a paradigm shift: Digital networks allow new degrees of openness for research and its resources, accompanied by a huge potential for scientists, inventors, industry and the general public. Accessible data will allow all groups to participate in innovation and value creation regardless of their geographical location or individual background. However, f...

Semantic Retrieval Interface for Statistical Research Data

Statistical research data is the foundation for empirical studies. Researchers in economics or social sciences often obtain such data from external sources through specially designed retrieval interfaces from statistical offices, commercial data providers as well as from data agencies and other archives. With the advancements in data cataloguing and acquisition of long tail research data sets f...

Identification of the Healthcare Services with Potential Induced Demand

Background and Objectives: Induced demand in healthcare refers to the provision of unnecessary services for the patient by health service providers while the patient is unaware that they are unnecessary. Apart from being unethical, this practice can potentially disturb the supply and demand balance in the health market, place a financial burden on the patient, threaten the patient’s health by imposi...

Status Report of bwFDM-Communities - A State Wide Research Data Management Initiative

Research data are valuable goods that are often only reproducible with significant effort or, in the case of unique observations, not at all. Scientists focus on data analysis and its results. By now, data exploration is accepted as a fourth scientific pillar (next to experiments, theory, and simulation). A main prerequisite for easy data exploration is successful data management. A holistic ap...

Research on Algorithm Recommended by Online Education for Big Data

“Big data” is becoming a hot topic on the Internet. The long tail problem of massive online courses has also become the biggest headache for the operation teams of online education. Presenting the courses that a reader most wants to the user is the key to improving the quality of online education. A personalized recommendation system is to discover the readers' interests tende...

Publication date: 2016